This file contains the supplementary data for the Covidex app. All graphs and some tables are interactive, so the reader can explore the data.
First, we present some basic stats from the training and testing datasets.
| Classification model | Training date | Sequences | Number of subtypes | Number of trees | mtry | OOB error rate |
| --- | --- | --- | --- | --- | --- | --- |
| Rambaut et al. nomenclature | 2020-06-05 | 10367 | 78 | 1000 | 300 | 0.0215 |
| Classification model | Sequences | Number of subtypes | Error | Multi-class AUC |
| --- | --- | --- | --- | --- |
| Rambaut et al. nomenclature | 2042 | 78 | 0.0137 | 0.9958 |
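The random-forest settings reported in the training table (number of trees, mtry, OOB error rate) can be illustrated with a minimal sketch. This is an assumption-laden example using scikit-learn with synthetic data standing in for the real sequence features; it is not the app's actual implementation, only a demonstration of what those hyperparameters mean.

```python
# Minimal sketch of fitting a random-forest subtype classifier and
# reading off its out-of-bag (OOB) error rate. The feature matrix and
# labels below are synthetic stand-ins, NOT the Covidex training data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 350))          # hypothetical feature matrix
y = rng.integers(0, 5, size=200)    # hypothetical subtype labels

clf = RandomForestClassifier(
    n_estimators=1000,   # "Number of trees" in the table
    max_features=300,    # "mtry": features considered at each split
    oob_score=True,      # score each sample on trees that never saw it
    random_state=0,
).fit(X, y)

oob_error = 1.0 - clf.oob_score_    # OOB error rate, as reported above
print(f"OOB error rate: {oob_error:.4f}")
```

The OOB error needs no held-out set: each tree is trained on a bootstrap sample, so roughly a third of the data is "out of bag" for any given tree and can serve as its test set.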
The following table presents evaluation metrics for each class:
Table captions:

- Sensitivity (Recall): the proportion of actual positive cases that were correctly identified.
- Specificity: the proportion of actual negative cases that were correctly identified.
- Positive Predictive Value (PPV): the proportion of predicted positive cases that are true positives.
- Negative Predictive Value (NPV): the proportion of predicted negative cases that are true negatives.
- Precision: equivalent to PPV; the proportion of predicted positive cases that were correctly identified.
- F1: the harmonic mean of precision and recall.
- Prevalence: the proportion of all cases that are actual positives.
- Detection Rate: the proportion of all cases that are correctly identified positive cases.
- Detection Prevalence: the proportion of all cases that were predicted as positive.
- Balanced Accuracy: the arithmetic mean of sensitivity and specificity.
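All of these per-class metrics follow directly from a confusion matrix, treating each class one-vs-rest. The sketch below shows the arithmetic; the confusion-matrix counts are hypothetical, not taken from the model.

```python
# Sketch of the per-class metrics defined above, computed one-vs-rest
# from a confusion matrix. The counts here are hypothetical.
import numpy as np

cm = np.array([[50,  2,  1],   # rows: actual class, columns: predicted class
               [ 3, 45,  2],
               [ 1,  1, 48]])

def class_metrics(cm, k):
    total = cm.sum()
    tp = cm[k, k]
    fn = cm[k, :].sum() - tp       # actual k, predicted as another class
    fp = cm[:, k].sum() - tp       # predicted k, actually another class
    tn = total - tp - fn - fp
    sens = tp / (tp + fn)          # Sensitivity / Recall
    spec = tn / (tn + fp)          # Specificity
    ppv  = tp / (tp + fp)          # Positive Predictive Value / Precision
    npv  = tn / (tn + fn)          # Negative Predictive Value
    return {
        "Sensitivity": sens,
        "Specificity": spec,
        "PPV": ppv,
        "NPV": npv,
        "F1": 2 * ppv * sens / (ppv + sens),
        "Prevalence": (tp + fn) / total,
        "Detection Rate": tp / total,
        "Detection Prevalence": (tp + fp) / total,
        "Balanced Accuracy": (sens + spec) / 2,
    }

print(class_metrics(cm, 0))
```

Note that Detection Rate is just true positives over all cases, so it can never exceed the class's Prevalence.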
The Precision-Recall curve confirms the good performance of the method.
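For a multi-class model, one common way to draw a single precision-recall curve is micro-averaging over one-vs-rest labels. The sketch below shows that computation with scikit-learn on synthetic scores; it is an illustration of the technique, not the scores produced by the actual model.

```python
# Sketch of a micro-averaged precision-recall curve for a multi-class
# classifier. Scores are synthetic: each sample's true class is boosted
# so the curve resembles that of a well-performing model.
import numpy as np
from sklearn.metrics import auc, precision_recall_curve
from sklearn.preprocessing import label_binarize

rng = np.random.default_rng(1)
y_true = rng.integers(0, 3, size=200)
scores = rng.random((200, 3))
scores[np.arange(200), y_true] += 1.0        # make the true class score highest
scores /= scores.sum(axis=1, keepdims=True)  # normalize to class probabilities

# Micro-averaging: flatten the one-vs-rest labels and scores, then
# compute a single precision-recall curve over all (sample, class) pairs.
y_bin = label_binarize(y_true, classes=[0, 1, 2])
precision, recall, _ = precision_recall_curve(y_bin.ravel(), scores.ravel())
pr_auc = auc(recall, precision)
print(f"micro-averaged PR AUC: {pr_auc:.3f}")
```

The `(recall, precision)` pairs can then be plotted directly; a curve that stays near the top-right corner, with area close to 1, indicates good performance across classes.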